Normalization 




N or maliz ation 






• Why do we use Normalization? 

Figure 14.4 Example relations for the schemas in Figure 14.3 that 
result from applying natural join to the relations in Figure 14.2. These 

may be stored as base relations for performance reasons. 

EMP DEPT 


ENA ME 

SSN 

BDATE 

ADDRESS 

□NUMBER 

DNAME 

DMGRSSN 

Smith, John B.. 

1 23456789 

1 965-01-09 

731 Fondrcn,Houston,TX 

5 

Research 

333445555 

Wong, Franklin T. 

33344 5555 

1 955- 1 2-08 

638 Voss.HoustonTX 

5 

Research 

333445555 

Zelaya. Alicia J 

999887777 

1 968-07- 1 9 

3321 Castle, Spring,TX 

4 

Admin istration 

987654321 

Wallace .Jennifer S. 

987654321 

1941-0 6-20 

291 Berry. Bellaire, T X 

4 

Adm in istnatio n 

987654321 

Narayan.Ramesh K. 

606884444 

1 962-09- 1 5 

975 Fire Oak .Humble, TX 

5 

Research 

333445555 

English, Joyce A. 

453453453 

1972-07-31 

5631 Rice, Ho uston.TX 

5 

Research 

333445555 

JabbatAhmad V. 

987987987 

1 969-03-29 

980 Dallas.Houston.TX 

4 

Admin istratio n 

987654321 

Borg .James E. 

888665555 

1 937- 11-10 

450 Stone, Houston ,TX 

1 

Headq uarta rs 

888665555 


EIV1P PROJ 


SSN 

PNUMBER 

HOURS 

E NAM E 

PWAM E 

PLOCATION 

1 23456789 

1 

32.5 

Smith .John B. 

ProductX 


Bellaire 

1 23456789 

2 

7.5 

Smith .John B. 

ProductY 


Sugarland 

666884444 

3 

40.0 

Narayan , Ramesh K . 

ProductZ 


Houston 

453453453 

1 

20.0 

English .Joyce A. 

ProductX 


Bellaire 

453453453 

2 

20.0 

English.Joyce A. 

ProductY 


Sugarland 

333445555 

2 

10.0 

Wong. Franklin T. 

ProductY 


Sugarland 

333445555 

3 

10.0 

Wong. Franklin T. 

ProductZ 


Houston 

333445555 

10 

10.0 

Wong, Franklin T. 

Comp rite rizatio n 

Stafford 

333445555 

20 

10.0 

Wong. Franklin T. 

Reorganisation 


Houston 

999887777 

30 

30.0 

Zelaya, Alicia J. 

Newbenefits 


Stafford 

999887777 

10 

10.0 

Zelaya. Alicia J. 

Comp ute r izatio n 

Stafford 

987987987 

10 

35.0 

J ab bar. Ah rn ad V. 

Com p ute rization 

Stafford 

987987987 

30 

5.0 

Jab bar. Ah mad V. 

Newbenefits 


Stafford 

987654321 

30 

20.0 

Wallace, Jennifer S. 

Newbenefits 


Stafford 

987654321 

20 

1 5.0 

Wall ace .Jennifer S 

Reorganization 


Houston 

888665555 

20 

null 

Borg. James E. 

Reorganization 


Houston 
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N ormaliz ation Avoids 


• Duplication of Data 

• The same data is listed in multiple lines of the database 

• Insert Anomaly 

• - A record about an entity cannot be inserted into the table 
without first inserting information about another entity - 
Cannot enter a customer without a sales order 

• Delete Anomaly 

• Update Anomaly 

• Frequent Null Values 


Functional dependency 


• a constraint between two attributes (columns) or two 
sets of columns 

• A set of attributes A functionally determines a set of 
attributes B if the value of A determines a unique 
value for B 

• A -> B if “for every valid instance of A, that value of A 
uniquely determines the value of B” 

• or ...A ->B if “there exists at most one value of B for 
every value of A” 


. . functional dependency 


some examples 

social security number determines employee name 
SSN -> ENAME 

project number determines project name and location 
PNUMBER -> {PNAME, PLOCATION} 

employee ssn and project number determines the 
hours per week that the employee works on the 
project 

{SSN, PNUMBER} -> HOURS 

So functional dependency is the technical term 

for determines 


keys and dependencies 


EMPL0YEE1 (Emp_ID, Name, Dept_Name, Salary) 



determinant 


o 


O 



EmpJD 

Name 

Dept_Name 

Salary 


i 

L A 

. i 

nO 



functional 

dependency 


Partial Dependency 


Functional Dependency 

• The value of one attribute in a table is determined 
entirely by the value of another 

. Partial Dependency 

• A type of functional dependency where an attribute 
is functionally dependent on only part of the 
primary key (primary key must be a composite 
key). 


Partial Dependency 

EMPL0YEE2 (EmpJD, Course_Title, Name, Address, 

Salary, Date_Completed) 




EmpJD 

Course_ 

Title 

Name 

Address 

Salary 

Date_ 

Comp. 




O 


o 


o 


not fully functionally 
dependant on the primary 
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Attribute types 


• In ER Model attributes can be classified into the 
following types. 

• Simple and Composite Attribute 

• Single Valued and Multi Valued attribute 

• Stored and Derived Attributes 

• Complex Attribute 


Simple and Composite Attribute 


• Simple attribute that consist of a single atomic value. 
A composite attribute is an attribute that can be 
further subdivided. For example the attribute 
ADDRESS can be subdivided into street, city, state, 
and zip code. A simple attribute cannot be subdivided 
For example the attributes age, sex etc are simple 
attributes. 


• Simple Attribute: Attribute that consist of a single 
atomic value. 

Example: Salary, age etc 


• Composite Attribute : Attribute value not atomic. 
Example : Address : ‘Flouse_no:City:State 



Single Valued and Multi Valued attribute 

■ A single valued attribute can have only a single value. For 
example a person can have only one 'date of birth', 'age' etc. 
That is a single valued attributes can have only single value. But 
it can be simple or composite attribute.That is 'date of birth' is a 
composite attribute , 'age' is a simple attribute. But both are 
single valued attributes. 

■ Multivalued attributes can have multiple values. For instance a 
person may have multiple phone numbers, multiple degrees 
etc. Multivalued attributes are shown by a double line connecting 
to the entity in the ER diagram. 

■ Single Valued Attribute: Attribute that hold a single value 
Examplel : Age, City, Customer id 

■ Multi Valued Attribute: Attribute that hold multiple values. 
Examplel : A customer can have multiple phone numbers, email 
id's etc 

Example2: A person may have several college degrees 


Stored and Derived Attributes 

• The value for the derived attribute is derived from the stored 
attribute. For example 'Date of birth' of a person is a stored 
attribute. The value for the attribute 'AGE' can be derived by 
subtracting the 'Date of Birth'(DOB) from the current date. Stored 
attribute supplies a value to the related attribute. 

• Stored Attribute: An attribute that supplies a value to the related 
attribute. 

Example: Date of Birth 

• Derived Attribute: An attribute that’s value is derived from a 
stored attribute. 

Example : age, and it’s value is derived from the stored attribute 
Date of Birth. 

• Complex Attribute 

• A complex attribute that is both composite and multi 
valued. 


Definition 


• Normalization: The process of decomposing 
unsatisfactory "bad" relations by breaking up their 
attributes into smaller relations 

• Normal form: Condition using keys and FDs of a relation 
to certify whether a relation schema is in a particular 
normal form 



13 


1 st Normal Form 


• First Normal Form 

• Separate Repeating Groups into New Tables. 

• Repeating Groups Fields that may be repeated 
several times for one document/entity 

• Create a new table containing the repeating data 

• The primary key of the table (repeating group) is 
always a composite key; Usually document number 
and a field uniquely describing the repeating line, like 
an item number. 



14 


INF 


• relation is in first normal form if it contains no 
multivalued or composite attributes 

• remove repeating groups to a new table as already 
demonstrated, “carrying” the PK as a FK 

• All columns (fields) must be atomic 

• Means : no repeating items in columns 


2NF 


* a relation is in second normal form if it is in first 
normal form AND every nonkey attribute is fully 
functionally dependant on the primary key 

• i.e. remove partial functional dependencies, so no 
nonkey attribute depends on just part of the key 


EMPL0YEE2 (Emp_ID, Course_Title, Name, 

Address, Salary, Date_Completed) 





r 

Emp 

>_ID 

Course_ 

Title 

Name 

Address 

Salary 

Date_ 

Comp. 


not fully functionally 
dependant on the primary 
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2NF 


• a relation is in 2NF if it is in 1 NF and any OPIG of 
these is true: 

• the PK consists of only 1 attribute 

• all attributes are part of the PK (no nonkey attributes) 

• every non key attribute is functionally dependant on the whole 
PK 
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INF -»2NF 


EMPL0YEE2 (Emp ID. Course Title . Name, 
Address, Salary, Date_Completed) 4 

EMPL0YEE1 (Emp ID , Name, adress, Salary) 

and 

EMP_COURSE (Emp ID. Course Title. 

Date_Completed) 

EMPLOYEE1 satisfies conditionl 

• EMP COURSE satisfies conditions 


3NF 


• a relation is in third normal form if it is in 2NF, AND 
no transitive dependencies exist 

• transitive dependency is a functional dependency 
between nonkey attributes 


transitive dependency 
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Cust_l D 

Name 

city 

Region 




Normalizing an Example Table 


• Unnormalized table: 


Scud en i# Adv iso r Ad v-Roo m Cbssl Class- Cbss3 

1022 Jones 412 101-07 143-01 159-02 

4123 S mith 2 Id 201-01 2 1 1-02 2 14-01 



First Normal Form: No Repeating Group: 

Another way to look at this problem is with a one-to-many 
relationship, do not put the one side and the many side in the 
same table. Instead, create another table in first normal form by 
eliminating the repeating group (Class#), as shown below: 


Stud mt# Advisor Adr-Ram Class? 


1022 

Jones 

412 

101-07 

1022 

Jones 

412 

143-01 

1022 

Jones 

412 

159-02 

4123 

£ mith 

216 

201-01 

4123 

$ mith 

216 

2 1 1-02 

4123 

£ mith 

216 

2 14-01 



Second Normal Form: Eliminate 
Redundant Data 

• Note the multiple Class# values for each Student# value in the 
above table. Class# is not functionally dependent on Student# 
(primary key), so this relationship is not in second normal form. 

• Students 



• Registration 



Stud en tfc C la s s# 

1022 

101-07 

1022 

143-01 

1022 

159-02 

4123 

201-01 

4123 

211-02 

4123 

214-01 
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Third Normal Form: Eliminate Data Not 
Dependent On Key(transitive dependency ) 


• , Adv-Room (the advisor's office number) is functionally 
dependent on the Advisor attribute. The solution is to move that 
attribute from the Students table to the Faculty table, as shown 
below: 


• Students 


Slud en t# Advisor 

1 222 Jonss 

4123 Smith 


• Faculty 


XimeRoomDept 

Jones 412 42 
Smi tli 2 16 42 





Sales Order 


Fiction Company 
202 N. Main 
Mahattan , KS 66502 


C u sto m erN umber: 
Customer Name: 
Customer Address: 


1001 

ABC Company 
100 Points 

Manhattan, KS 66502 


Sales Order Number: 405 

Sales Order Date: 2/1/2000 

Clerk Number: 210 

Clerk Name: Martin Lawrence 


Item Ordered 

Description 

Quantity 

Unit Price 

Total 

800 

widgit small 

40 

60.00 

2,400.00 

801 

tingimajigger 

20 

20.00 

400.00 

805 

thin gib ob 

10 

100.00 

1,000.00 
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Solution 


• SalesOrderNo . Date, CustomerNo, CustomerName, 
CustomerAdd, ClerkNo, ClerkName, ItemNo . 
Description, Qty, UnitPrice 

• First Normal Form 

• Repetition of Data -Header data repeated for every line 
in sales order(ltemNo, Description, Qty, UnitPrice). 
Separate them in a new tabel. 

• The new table is as follows: 

• SalesOrderNo . ItemNo, Description, Qty, UnitPrice 

• The repeating fields will be removed from the original 
data table, leaving the following. 

• SalesOrderNo . Date, CustomerNo, CustomerName, 
CustomerAdd, ClerkNo, ClerkName 
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2 nd Normal Form 


• The new table will contain the following fields 

• ItemNo . Description, UnitPrice 

• SalesOrderNo . ItemNo, Qty 

• SalesOrderNo . Date, CustomerNo, 
CustomerName, CustomerAdd ClerkNo, 
ClerkName 


3 rd Normal Form 


• CustomerNo, CustomerName CustomerAdd 

• ClerkNo. C erkName 

• SalesOrderNo . Date, CustomerNo, ClerkNo 

• Unchanged Tables 

• ItemNo . Description, UnitPrice 

• SalesOrderNo . ItemNo, Qty 



Completed Tables in 3 rd Normal Form 


• Customers: CustomerNo , CustomerName, 
CustomerAdd 

• Clerks: ClerkNo , ClerkName 

• Inventory Items: ItemNo, Description, UnitPrice 

• Sales Orders: SalesOrderNo . Date, CustomerNo 
ClerkNo 

• SalesOrderDetail: SalesOrderNo . ItemNo . Qty, 



Example 


Students Sheet 

Platform Name : SWE Platform Description: Software Engineering 

Graduate Profile: ALL 


Appno 

Name 

F-code 

Faculty 

Major 

Address 

Telno 

Found 

■ 

Grade 

Total 

Att. 

Hrs 

Start 

date 

123 

Ahmed 

SC-phy 

Science 

Physics 

Haram 

338684 

20 

A 

600 

14 Sep 

124 

Mona 

Eng-cs 

Engineering 

Computer 

Dokki 

338974 

55,338 

97445 

B 

591 

15 Sep 

127 

Ali 

Com-ac 

Commerce 

Accounting 

Nasr City 

224159 

39 

A 

550 

21 Sep 

223 

Karim 

Med-bio 

Medicine 

Biochemistry 

Sheraton 

228684 

56 

C 

600 

14 Sep 


1 st Normal Form 


• Normalization: First Normal Form 

• Separate Repeating Groups into New Tables. 

• Repeating Groups Fields that may be repeated 
several times for one document/entity 

• Create a new table containing the repeating data 

• The primary key of the table (repeating group) is 
always a composite key; Usually document number 
and a field uniquely describing the repeating line, like 
an item number. 


INF: 


• Platform : pfname , pfdesc 

• Students: pfname, appno . 
address, Foundgrade, attd 

• Std_Tel: appno, telno 


, pfgraduate 

name , faculty , major 
start date 
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2NF 


• Students: appno . name , f-code .faculty , major 
address 

• Students_pf: pfname, appno . Foundgrade, attd 
start_date 

• Unchanged Tables 

• Platform : pfname , pfdesc , pfgraduate 

• Std_Tel: appno, telno 


3NF 


• Students: appno . name , major, address 

• Fac_majors:faculty , major 

• Unchanged Tables 

• Platform : pfname , pfdesc , pfgraduate 

• Std_Tel: appno, telno 

• Students_pf: pfname.appno . Foundgrade, attd 
start date 


Case Study 


Relation (s#, country, currency, p#, qty) 
where 

s# supplier identifcation number (this is the primary 
key) 

country name of country where supplier is located 
currency: Currency of the country of each supplier 
p# part number of part supplied 
qty quantity of parts supplied to date 

In order to uniquely associate quantity supplied (qty) 
with part (p#) ana supplier (s#), a composite primary 
key composed of s# and p# is used. 


Real World - School Data 


Student 

First 

Parent 1 

Parent 2 

Application No 


Renee 

Ann Jones 

Theodore Smith 

123 


Lucy 

Barbara Mills 

Steve Mills 

558 


Brendan 

Jennifer Jones 

Stephen Jones 

145 

■ ■ ■ 

City 

Postal Code 

Home Phone 

Program 


Annandale 

22003 

(703) 323-0893, 
(703) 3240708 

PI 


Annandale 

22003 

(703) 764-5829 

P2 


Fairfax 

22032 

(703) 978-1083 

P3 

■ ■ ■ 

Enrolled 

Attended/ days 

Birth date 

Previous Teacher 

Current 

Teacher 

96/97 

0 

6/25/1 983 

Hamil 

Burke 

96/97 

0 

8/1 4/1 983 

Hamil 

Burke 

96/97 

0 

6/1 3/1 984 

Hamil 

Burke 
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